Introduction

Below is the general workflow for replicating our analyses of trait evolution in Tiliquini skinks. While we provide the scripts to redo everything from scratch, we also include Rdata files that will give you the output from model fits, objects, etc., to speed up the process.

Load some necessary packages. We’ll need others later but will add them as we go.

library(ape)
library(phytools)
library(ggplot2)
library(patchwork)
library(dplyr)

Let’s save some time and start by loading our data.

load("Egernia_Data.RData", verbose=T)
## Loading objects:
##   all.modules
##   egernia.tree
##   etree
##   egernia.tree.alt
##   lessLSR
##   smallLSR

And we’ll have a look at the tree we’re going to use.

plot(egernia.tree, cex=0.5); axisPhylo()

The all.modules object is a list of data frames that hold all the trait data. We’ll quickly put all our traits into a single data frame for use later.

all.traits <- bind_cols(all.modules$head, all.modules$body, all.modules$tail, all.modules$limb,
                        all.modules$size, all.modules$neck, all.modules$eye)

And load some functions we’ll want later

source("../Scripts/trait.at.time.R")

Determining Morphological Modules

Let’s think about how the body plan might be split up into modules that evolve under different pressures, rates, and modes. Individual traits might covary with others as a result of development or selective pressures or many other reasons. These covarying traits may together constitute modules, e.g. snout length (log-shape ratios) and head with might be part of a common ‘head’ module. We can quickly visualize how the size-corrected traits covary generally.

library(corrplot)
## corrplot 0.92 loaded
colorder <- c("Head_Width", "Head_Depth", "Pos_Skull", "Snout_Eye", "Eye_Diameter",
              "Body_Width", "Interlimb", "Pelvic_Height", "Pelvic_Width",
              "Upper_Arm", "Lower_Arm", "Hand", "Upper_Leg", "Lower_Leg", "Foot",
              "Tail_Width", "Tail_Length",
              "Neck", "Size")
corrLSR <- all.traits[,colorder]

corr.small <- as.matrix(cor(corrLSR))
#corrplot::corrplot.mixed(corr.small, lower="ellipse", upper="number")
corrplot(corr.small)

You’ll notice that some groups of traits (hand/foot/leg elements) show strong positive covariance. Same for others like head width/depth.

We can create some models to test basic modularity hypotheses.
We’ll use the R package EMMLi to identify morphological modules of the lizard body.

library(EMMLi)

We’ve designed some basic models to test.
Remember: no module can be made up of only a single trait! If it is, code it as NA

model.small <- read.csv("Morph_Module_Models.csv", header=T)
res <- EMMLi(corr = corr.small, 
      N_sample = 56, 
      mod = model.small,
      abs = T,
      pprob = 0.05); 
res

Based on our preferred module model we have saved our traits into the all.modules object.
The preferred model corresponds to 4 modules (head, body, tail, limb). Three traits (neck, eye, pelvis) are unintegrated and do not fit among the other modules. In the process of size-correcting our traits we also generate a new trait size, so we’ll keep that.

names(all.modules)
## [1] "head"   "body"   "tail"   "limb"   "size"   "neck"   "eye"    "pelvis"
head(all.modules$head)
##                              Head_Width Snout_Eye Head_Depth Pos_Skull
## Tribolonotus_ponceleti        1.1881587 0.3631190  0.8845977 0.7614419
## Tribolonotus_pseudoponceleti  1.0713812 0.3390085  0.7727104 0.7987440
## Tribolonotus_blanchardi       0.8335979 0.3098250  0.6311908 0.6640375
## Tribolonotus_gracilis         1.1688530 0.3428757  0.9267157 0.7107731
## Corucia_zebrata               1.0718004 0.3011819  0.8046630 0.6062986
## Lissolepis_luctuosa           0.8110920 0.4167810  0.7283571 0.6141436

Running l1ou on Morphological Modules

We might want to look at fitting multi-OU models to our modules. Let’s fit l1ou to all the modules and save the output.

l1ou.modules <- list(); l1ou.trees <- list()
#for (y in 1:length(all.modules)){
for (y in 1:4){
  # Align the trait data and tree
  trait.data <- adjust_data(egernia.tree, all.modules[[y]])
  #### Estimate the number and position of shifts a priori 
  fit <- estimate_shift_configuration(trait.data$tree, trait.data$Y, 
                                      nCores=8, quietly=F, criterion="AIC") 
  # if you want to do only a single trait, 'data$Y[,x]'
  # test for convergent regimes if you'd like (this is slow, and not really necessary)
  #shift.fit <- estimate_convergent_regimes(fit, nCores=8, criterion="BIC")
  # save the results to a list
  #l1ou.modules[[y]] <- shift.fit
  l1ou.modules[[y]] <- fit
  # save the simmap trees to a list
  #l1ou.trees[[y]] <- shifts.to.simmap.l1ou(shift.fit)
  l1ou.trees[[y]] <- shifts.to.simmap.l1ou(fit)
  #plot(l1ou.trees[[y]])
}
names(l1ou.modules) <- names(all.modules)[1:4]
names(l1ou.trees) <- names(all.modules)[1:4]

If you ran the above, save these results externally

save(l1ou.modules, l1ou.trees, file="l1ou_MorphModules.RData")

If you don’t have the time to run these, load them instead

load("l1ou_MorphModules.RData", verbose=T)
## Loading objects:
##   l1ou.modules
##   l1ou.trees

Plot all the trees with shifts

source("../Scripts/shifts.to.simmap.l1ou.R")
par(mfrow=c(3,3))
for(j in 1:length(l1ou.trees)){
  plotColorSimmap(l1ou.trees[[j]]); axisPhylo(); title(paste(names(l1ou.trees)[[j]], (ncol(l1ou.trees[[j]]$mapped.edge)-1), "shifts"))
}
## no colors provided. using the following legend:
##         0         1         2         3         4         5         6         7 
##   "black" "#f7e89b" "#edb09c" "#953ec4" "#351887" "#e4f298" "#6e40ed" "#a3f49f" 
##         8         9 
## "#80b72d" "#9bf7ab"
## no colors provided. using the following legend:
##         0         1         2         3         4         5         6 
##   "black" "#96a1e8" "#d8874e" "#11c43b" "#d350f4" "#cea623" "#76db69"
## no colors provided. using the following legend:
##         0         1         2         3         4         5         6 
##   "black" "#c1482a" "#44f48d" "#d9ed04" "#cc6e4f" "#7429e5" "#fcdb6f"
## no colors provided. using the following legend:
##         0         1         2         3         4         5         6         7 
##   "black" "#bced8e" "#e06716" "#bfe079" "#077f15" "#ef13d5" "#ef540b" "#e04ae0" 
##         8 
## "#ffc9f4"


Fitting Continuous Models of Trait Evolution

Common and Lévy Models

We’ll start by fitting a standard set of comparative models (BM, OU, EB) and Levy models in pulsR.
Code for running those models across all traits individually is contained in RUN_pulsR.R. For the sake of brevity, we won’t run the models here because they take a while.

Variable Rates Model

We can use BayesTraits to estimate branch specific rates of trait evolution. This method uses a variable-rates Brownian Motion model, and is run externally to R. We can read in our results though. The next step is to run BayesTraits across all the traits individually. Like any Bayesian method, multiple independent chains should be run and compared to assess convergence.

Begin by saving each trait to a BayesTraits formatted file (following SaveTraitsToFiles_BayesTraits.R)

for(j in 1:length(all.modules)){
  curr.mod <- all.modules[[j]]
  for (k in 1:length(curr.mod)){
    trait <- select(curr.mod, k)
    tname <- colnames(trait)
    filename <- paste0("~/Desktop/", tname, ".txt")
    write.table(trait, file=filename, sep=" ", quote=F, col.names=F)
  }
}

Format your BayesTraits control file.

7 # independent contrasts
2 # MCMC (Markov Chain Monte Carlo)
VarRates # specify the Variable Rates model
Burnin 10000000 # give the MCMC plenty of burn-in
Iterations 110000000 # total generation length (Iterations - Burnin = Sample)
Run # tell it to run the analysis

Once you’ve run all those analyses and confirmed convergence/stationarity, the next step will be to process the BayesTraits outputs. This can be slightly complicated without a helper file. Use the process_PPP function in plotting_BayesTraits.R to process the outputs. You can do this following the RUN_BayesTraits_VarRates_Likelihoods.R script. The same script will also allow you to extract comparable AIC values for model comparison.

Model Comparison

Now that we’ve fit all the models, get the model fits together so we can compare them.

load("pulsR_Results.Rdata", verbose=T)
## Loading objects:
##   trait.fit
##   params.list
##   trait.aicw.all
##   trait.fit.types
##   trait.aicw.all
##   trait.aicw.types
##   aicw.df
##   aicw.all.df

Since we have the objects already, plot them here.

# plot the 'type'  results (lumps levy models together)
aicw.types.plot <- ggplot(aicw.df, aes(fill=model, y=aicw, x=trait)) + 
  geom_bar(position="stack", stat="identity") + theme_classic() + 
  scale_fill_brewer(palette="RdYlBu") +
  theme(axis.text.x = element_text(angle = 45, hjust=1), legend.position="bottom")


# plot all the AICw (plots all levy models separately)
aicw.all.plot <- ggplot(aicw.all.df, aes(fill=model, y=aicw, x=trait)) + 
  geom_bar(position="stack", stat="identity") + theme_classic() + 
  scale_fill_brewer(palette="RdYlBu") +
  theme(axis.text.x = element_text(angle = 45, hjust=1), legend.position="bottom")

# plot the two together
aicw.types.plot / aicw.all.plot

Lastly we want to be able to tell if there are specific models that are favored for individual traits. We designate a model as ‘preferred’ if it has 2x the AICw of the next best model

sig.fit <- NULL
for(k in 1:length(trait.aicw.types)){
  curr.res <- sort(trait.aicw.types[[k]], decreasing=T)
  if(curr.res[[1]]/curr.res[[2]] > 2){sig.fit <- append(sig.fit, names(curr.res)[[1]])}
  if(curr.res[[1]]/curr.res[[2]] <= 2){sig.fit <- append(sig.fit, "none preferred")}
}
names(sig.fit) <- names(trait.aicw.types)
data.frame(Preferred_Model = sig.fit[order(names(sig.fit))])
##               Preferred_Model
## Body_Width           VarRates
## Eye_Diameter         VarRates
## Foot                 VarRates
## Hand           none preferred
## Head_Depth                 BM
## Head_Width           VarRates
## Interlimb                Jump
## Lower_Arm            VarRates
## Lower_Leg      none preferred
## Neck           none preferred
## Pelvic_Height  none preferred
## Pelvic_Width         VarRates
## Pos_Skull                Jump
## Size                 VarRates
## Snout_Eye                  BM
## Tail_Length          VarRates
## Tail_Width     none preferred
## Upper_Arm            VarRates
## Upper_Leg            VarRates

Visualizing Variable Rates Models

We will want to be able to visualize the results of our VR models.

load("BayesTraits_processed_traits.RData", verbose=T)
## Loading objects:
##   all.BT
names(all.BT)
## [1] "all.trees"         "all.sig"           "all.res"          
## [4] "scalar.trees"      "mean.scalar.trees" "rate.trees"

Here’s an explanation of what each of those elements is:
1. all.trees holds a tree for each trait. The tree is your input tree, rescaled by the mean estimated sigma (rate) value per branch (Mean.SigV in BayesTraits terms).
2. all.sig holds the mean sigma value per branch in a vector (?), this is the value each branch above was scaled by.
3. all.res holds a data frame for each trait. The data frame includes the node and branch information, rate info, etc.
4. scalar.trees holds a tree for each trait. The tree is your input tree, rescaled by the median scalar r (Median.Scalar in BayesTraits terms).
5. mean.scalar.trees holds a tree for each trait. The tree is your input tree, rescaled by the mean scalar r (Mean.Scalar in BayesTraits terms).
6. rate.trees holds a tree for each trait. The tree is your input tree, with each branch rescaled to the mean estimated sigma value (Mean.SigV).

Plot BayesTraits Trees with Shifts

To visualize the rates and shifts across the tree use the plot.VarRates.tree function.

source("../Scripts/plotting_BayesTraits.R")
plot.VarRates.tree(BT = all.BT$all.res$BodyWidth, # res object for focal trait
                   phy = all.BT$scalar.trees$BodyWidth, # tree for focal trait
                   col.palette = "YlGnBu", # choose your color palette
                   legend = T, # want a rate legend included?
                   tree.type = "phylogram", # tree shape
                   log.rates = T, # should rates be logged (probably)
                   trait = "Body Width", # name of the trait
                   outline = F, # black outline on branches for light colors
                   pos.selection = T, # indicate positive selection (r >= 10)
                   shift.rates = T) # indicate shifted rates (r >= 2)

We can also apply this across all the traits simultaneously.

# plot all the trees with edges colored by rates
par(mfrow=c(2,2))
# plot the scalar transformed trees
for(j in 1:4){plot.VarRates.tree(BT=all.BT$all.res[[j]], phy=all.BT$scalar.trees[[j]], col.palette="YlOrRd", legend=F, tree.type="phylogram", log.rates=T, trait=names(all.BT$all.res)[[j]], outline=F, pos.selection=T, shift.rates=T)}

Plot Rate and Disparity Through Time

Start by reading in the Median Scalar Trees (all.BT$scalar.trees) from file if you’d like. We’ll need these for most of the visualizations going forward.

all.trees <- read.nexus(file="../Trees/BayesTraits_VarRates_Egerniinae_MorphTraits_Scalar.trees")
plot(all.trees$Body_Width, show.tip.label = F); axisPhylo()

# sort the order of the trees to match the columns
all.trees <- all.trees[colnames(all.traits)]

Disparity Through Time

We’ll extract the ancestral trait values for each trait

anc.traits <- NULL; anc.all.traits <- NULL
for (k in 1:length(all.trees)){
  base.trait <- all.traits[,k]; names(base.trait) <- rownames(all.traits)
  anc.traits[[k]] <- fastAnc(all.trees[[k]], base.trait)
  anc.all.traits[[k]] <- c(base.trait, anc.traits[[k]])
  #phenogram(egernia.tree, x=c(base.trait, anc.trait), fsize=0.5)
}
names(anc.traits) <- names(all.trees)
names(anc.all.traits) <- names(all.trees)

Or, we can just read in all these bits from file.

load("Tiliquini_Ancestral_MorphTraits.Rdata", verbose=T)
## Loading objects:
##   all.traits
##   anc.traits
##   anc.all.traits
##   all.trees

all.traits is a data frame of the traits for all living taxa.
anc.traits is a named list of vectors of ancestral traits.
anc.all.traits is a named list of vectors of traits of living taxa and ancestors.
all.trees is a named multiPhylo object of Median Scalar trees.

Anyway, let’s move on and get the trait value along branches at given time intervals (here 0.25 million years). This will allow us to plot the traits through time.

all.anc <- lapply(1:length(anc.all.traits), function(x) trait.at.time(timeslices=0.25, phy=egernia.tree, trait.vector=anc.all.traits[[x]], plot=F))
names(all.anc) <- names(anc.all.traits)
# have a look at what the data look like
head(all.anc$Head_Width)
##     time     trait
## 58  0.00 0.9764456
## 61  0.00 0.9764456
## 581 0.25 0.9769712
## 611 0.25 0.9761035
## 582 0.50 0.9774968
## 612 0.50 0.9757614

Next we can extract the disparity across all living branches at given time intervals (same as the rate above). I call this Disparity-at-Time.

emp.dis <- lapply(1:length(all.anc), function(x) extract.variance(all.anc[[x]], plot = FALSE, metric = "disparity"))
names(emp.dis) <- names(all.anc)
emp.dis <- lapply(1:length(emp.dis), function(x) emp.dis[[x]] <- cbind(emp.dis[[x]], trait=names(emp.dis)[[x]]))
names(emp.dis) <- names(all.anc)
emp.dis.df <- NULL
for(k in 1:length(emp.dis)){emp.dis.df <- rbind(emp.dis.df, emp.dis[[k]])}

Show a single disparity plot

plot(emp.dis$Head_Width$measure ~ emp.dis$Head_Width$time, 
         xlab="Time", ylab=paste("Total", "Disparity"), type="l", main="Head Width")

    #abline(v=30.06, col="#ABDDA4", lty=3, lwd=3)
    plot(emp.dis$Head_Width$measure.rich ~ emp.dis$Head_Width$time, 
         xlab="Time", ylab=paste("Relative", "Disparity"), type="l", main="Head Width")

    layout(matrix(1))

Now that we have the disparity values through time we can plot them for each trait.

ggplot(emp.dis.df) +
  geom_line(aes(x=time, y=measure, color=trait)) +
  facet_wrap(~trait, scales="free") + 
  theme_classic() + theme(legend.position="none")

Rate Through Time

Ok let’s apply the same principles to the rates. I’ll call this Rate-at-Time

# annoying, but to get the right order we have to rename the rate objects
sig.BT <- all.BT$all.sig
# reorder the objects
for(k in 1:length(sig.BT)){names(sig.BT[[k]])[1:56] <- egernia.tree$tip.label}
for(k in 1:length(sig.BT)){sig.BT[[k]] <- append(sig.BT[[k]], 0, after=Ntip(egernia.tree))}
for(k in 1:length(sig.BT)){names(sig.BT[[k]])[[57]] <- 57}
# actually extract the rates-at-time
all.RaT <- lapply(1:length(sig.BT), function(x) rate.at.time(timeslices=0.25, phy=egernia.tree, rate.vector=sig.BT[[x]]))
names(all.RaT) <- names(sig.BT)
# extract the 95% quantile around the estimated rate
all.rates <- lapply(1:length(all.RaT), function(x) extract.stat(all.RaT[[x]], stat="mean", plot = F, range="quantile"))
names(all.rates) <- names(all.RaT)
# combine the mean and quantiles into a single data frame
all.rates <- lapply(1:length(all.rates), function(x) all.rates[[x]] <- cbind(all.rates[[x]], trait=names(all.rates)[[x]]))
names(all.rates) <- names(all.RaT)
# reshape into a new dataframe
all.rates.df <- NULL
for(k in 1:length(all.rates)){all.rates.df <- rbind(all.rates.df, all.rates[[k]])}

Now that we have the rates through time for each trait, plot it.

ggplot(all.rates.df) +
  geom_ribbon(aes(x=time, ymin=`5%`, ymax=`95%`, fill=trait, alpha=0.5)) +
  geom_line(aes(x=time, y=rate, color=trait)) +
  facet_wrap(~trait, scales="free") + 
  theme_classic() + theme(legend.position="none")

Here we can compare the rates through time for several traits. As an example I show the rates for the head module traits. This follows the right panel of Figure 3.

library(ggbreak)
all.rates.df$`5%`[which(all.rates.df$`5%`<=0)] <- 0
head.rates.df <- filter(all.rates.df, trait %in% c("HeadDepth","HeadWidth","PosSkull","EyeDiameter"))
p1 <- ggplot(head.rates.df) +
  geom_ribbon(aes(x=rev(time), ymin=`5%`, ymax=`95%`, fill=trait), alpha=0.1) +
  geom_line(aes(x=rev(time), y=rate, color=trait)) +
  scale_color_brewer(palette="Spectral") +
  scale_fill_brewer(palette="Spectral") +
  scale_x_reverse() +
  #scale_y_log() +
  xlab("Time (Ma)") + ylab("Evolutionary Rate") +
  theme_classic() + theme(legend.position="bottom")

p1


Visualizing the Accumulation of Disparity

We want to visualize the accumulation of disparity compared to a Brownian Motion null. We’ll do this in both the multivariate and univariate cases, and use that to understand temporal trends in morphospace exploration.

Start by sourcing a couple bits of code we’ll need.

source("../Scripts/disparity.to.BM.R")
source("../Scripts/trait.at.time.R")

Disparity of Modules

load("BayesTraits_processed_modules.RData", verbose=T)
## Loading objects:
##   all.BT
names(all.BT)
## [1] "all.trees"         "all.sig"           "all.res"          
## [4] "scalar.trees"      "mean.scalar.trees" "rate.trees"

Extract information about the empirical accumulation of disparity in modules compared to a Brownian Motion null with correlated trait evolution.

# HEAD
head.corBM <- disparity.to.BM.module(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$Head, 
                                     trait.df=all.modules$head, sim.num=100, metric="variance", 
                                     trait.name="Head", loess.span=0.1, window.size=20, trait.correlation=T)
# BODY
body.corBM <- disparity.to.BM.module(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$Body, 
                                     trait.df=all.modules$body, sim.num=100, metric="variance", 
                                     trait.name="Body", loess.span=0.1, window.size=20, trait.correlation=T)
# TAIL
tail.corBM <- disparity.to.BM.module(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$Tail, 
                                     trait.df=all.modules$tail, sim.num=100, metric="variance", 
                                     trait.name="Tail", loess.span=0.1, window.size=20, trait.correlation=T)
# LIMB
limb.corBM <- disparity.to.BM.module(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$Limb, 
                                     trait.df=all.modules$limb, sim.num=100, metric="variance", 
                                     trait.name="Limb", loess.span=0.1, window.size=20, trait.correlation=T)

To save time, let’s just load saved versions of the above objects.

load("Disparity_processed_modules.RData")
# each object should have the following 6 fields
names(head.corBM)
## [1] "disparity"        "slopes"           "simulated.metric" "empirical.metric"
## [5] "slopes.df"        "sim.vat"

Plot the accumulation curves and difference of slopes independently for each module.

# Plot all the disparity (variance) curves and difference in slopes
  (head.corBM$disparity | head.corBM$slopes | plot_spacer()) /
  (limb.corBM$disparity | limb.corBM$slopes | plot_spacer()) /
  (tail.corBM$disparity | tail.corBM$slopes | plot_spacer()) /
  (body.corBM$disparity | body.corBM$slopes | plot_spacer())

Now combine the separate disparity summaries (empirical, simulated, slopes) into workable dataframes.

all.disparity <- rbind(cbind(limb.corBM$empirical.metric, trait="limb"),
                       cbind(head.corBM$empirical.metric, trait="head"),
                       cbind(body.corBM$empirical.metric, trait="body"),
                       cbind(tail.corBM$empirical.metric, trait="tail"))
all.sim.disparity <- rbind(cbind(limb.corBM$simulated.metric, trait="limb"),
                           cbind(head.corBM$simulated.metric, trait="head"),
                           cbind(body.corBM$simulated.metric, trait="body"),
                           cbind(tail.corBM$simulated.metric, trait="tail"))
all.slope <- rbind(cbind(limb.corBM$slopes.df, trait="limb"),
                   cbind(head.corBM$slopes.df, trait="head"),
                   cbind(body.corBM$slopes.df, trait="body"),
                   cbind(tail.corBM$slopes.df, trait="tail"))

Plot the empirical vs. simulated disparity

disparity.plot <- ggplot() +
  #geom_ribbon(data=all.sim.disparity, aes(x=rev(time), ymin=`5%`, ymax=`95%`, fill=trait), alpha=0.1) +
  #scale_fill_brewer(palette="Spectral") +
  geom_line(data=all.sim.disparity, aes(x=rev(time), y=`50%`, color=trait), alpha=0.75, linetype="dotted") +
  geom_line(data=all.disparity, aes(x=rev(time), y=measure, color=trait)) +
  scale_color_brewer(palette="Spectral") + xlab("Time (Ma)") + ylab("Disparity (variance)") +
  scale_x_reverse() + theme_classic() + theme(legend.position="bottom")

Plot the difference in slopes of empirical vs. simulated disparity

slope.plot <- ggplot() +
  geom_abline(slope=0, intercept=0, color="black", linetype="dotted") +
  geom_ribbon(data=all.slope, aes(x=rev(time),ymin=slope.lower,ymax=slope.upper, fill=trait), alpha=0.1) +
  scale_fill_brewer(palette="Spectral") +
  geom_line(data=all.slope, aes(x=rev(time), y=slope.diff, color=trait)) +
  scale_color_brewer(palette="Spectral") + 
  xlab("Time (Ma)") + ylab("Difference in Slopes (observed - BM") +
  scale_x_reverse() + theme_classic() + theme(legend.position="bottom")

Combine the plots

(disparity.plot + slope.plot + plot_spacer()) + plot_layout(guides="collect") & theme(legend.position="bottom")

What this shows:
On the left we see the accumulation of disparity as variance of empirical trends in the modules (solid lines) compared to a correlated Brownian Motion null model. On the right we visualize the difference in slopes (through time) between the empirical and simulated trends. Values above 0 indicate periods of morphospace expansion—disparity is accumulating greater in the empirical system than in simulations under BM—and values below 0 indicate periods of morphospace conservatism/packing—disparity is accumulating slower in the empirical system than in simulations under BM.

Disparity of Individual Traits

We should be able to do this similarly for individual traits.

load("BayesTraits_processed_traits.RData", verbose=T)
## Loading objects:
##   all.BT
names(all.BT)
## [1] "all.trees"         "all.sig"           "all.res"          
## [4] "scalar.trees"      "mean.scalar.trees" "rate.trees"

We’ll run the same type of analyses on traits within a module. Because we don’t have a saved object like for the modules let’s run a small number of simulations (10), but if you were doing this for real you’d want to increase sim.num > 100.

# put our traits into individual named vectors
traits.vec <- lapply(1:20, function(x) setNames(smallLSR[,x], rownames(smallLSR)))
names(traits.vec) <- colnames(smallLSR)[1:length(traits.vec)]

# process each trait
# HEAD traits 
hd.var <- disparity.to.BM(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$HeadDepth,
                          trait=traits.vec$Head_Depth, sim.num=10, metric="variance", 
                          trait.name="Head Depth", loess.span=0.1, window.size=20)
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
hw.var <- disparity.to.BM(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$HeadWidth,
                          trait=traits.vec$Head_Width, sim.num=10, metric="variance", 
                          trait.name="Head Width", loess.span=0.1, window.size=20)
ps.var <- disparity.to.BM(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$PosSkull,
                          trait=traits.vec$Pos_Skull, sim.num=10, metric="variance", 
                          trait.name="Pos Skull", loess.span=0.1, window.size=20)
se.var <- disparity.to.BM(phy=egernia.tree, scaled.phy=all.BT$scalar.trees$SnoutEye,
                          trait=traits.vec$Snout_Eye, sim.num=10, metric="variance", 
                          trait.name="Snout Eye", loess.span=0.1, window.size=20)

Now combine the separate disparity summaries (empirical, simulated, slopes) into workable dataframes.

all.disparity <- rbind(cbind(hd.var$empirical.metric, trait="head depth"),
                       cbind(hw.var$empirical.metric, trait="head width"),
                       cbind(ps.var$empirical.metric, trait="posterior skull"),
                       cbind(se.var$empirical.metric, trait="snout eye"))
all.sim.disparity <- rbind(cbind(hd.var$simulated.metric, trait="head depth"),
                           cbind(hw.var$simulated.metric, trait="head width"),
                           cbind(ps.var$simulated.metric, trait="posterior skull"),
                           cbind(se.var$simulated.metric, trait="snout eye"))
all.slope <- rbind(cbind(hd.var$slopes.df, trait="head depth"),
                   cbind(hw.var$slopes.df, trait="head width"),
                   cbind(ps.var$slopes.df, trait="posterior skull"),
                   cbind(se.var$slopes.df, trait="snout eye"))

Plot the empirical vs. simulated disparity

disparity.plot <- ggplot() +
  #geom_ribbon(data=all.sim.disparity, aes(x=rev(time), ymin=`5%`, ymax=`95%`, fill=trait), alpha=0.1) +
  #scale_fill_brewer(palette="Spectral") +
  geom_line(data=all.sim.disparity, aes(x=rev(time), y=`50%`, color=trait), alpha=0.75, linetype="dotted") +
  geom_line(data=all.disparity, aes(x=rev(time), y=measure, color=trait)) +
  scale_color_brewer(palette="Spectral") + xlab("Time (Ma)") + ylab("Disparity (variance)") +
  scale_x_reverse() + theme_classic() + theme(legend.position="bottom")

Plot the difference in slopes of empirical vs. simulated disparity

slope.plot <- ggplot() +
  geom_abline(slope=0, intercept=0, color="black", linetype="dotted") +
  geom_ribbon(data=all.slope, aes(x=rev(time),ymin=slope.lower,ymax=slope.upper, fill=trait), alpha=0.1) +
  scale_fill_brewer(palette="Spectral") +
  geom_line(data=all.slope, aes(x=rev(time), y=slope.diff, color=trait)) +
  scale_color_brewer(palette="Spectral") + 
  xlab("Time (Ma)") + ylab("Difference in Slopes (observed - BM") +
  scale_x_reverse() + theme_classic() + theme(legend.position="bottom")

Combine the plots

(disparity.plot + slope.plot + plot_spacer()) + plot_layout(guides="collect") & theme(legend.position="bottom")


Variable Rates of Modules

Up until now we’ve been looking at rates and disparity of individual traits. We’ll shift now to looking at trends in modules (head, body, tail, limbs).

Rate Trajectories (Rate/Trait-grams)

source("../Scripts/rate.trajectory.R")
## Loading required package: tibble
load("BayesTraits_processed_modules.RData", verbose=T)
## Loading objects:
##   all.BT
names(all.BT)
## [1] "all.trees"         "all.sig"           "all.res"          
## [4] "scalar.trees"      "mean.scalar.trees" "rate.trees"
# Rate Trajectory and Rate-to-Node for HEAD
HEAD.anc.module <- data.frame(Head_Width = anc.all.traits$Head_Width, 
                              Snout_Eye = anc.all.traits$Snout_Eye, 
                              Head_Depth = anc.all.traits$Head_Depth, 
                              Pos_Skull = anc.all.traits$Pos_Skull)

HEAD.plot <- rate.trajectory.BT.module(tree = egernia.tree, module = HEAD.anc.module,
                                       PPP.all.res = all.BT$all.res$Head,
                                       tip.spread = c("Tiliqua_rugosa","Tiliqua_scincoides"),
                                       focus = "clade", psize=3, lsize=2, background.color="grey",
                                       gimme.the.data=F, inset=T)

We’ve run each of these modules through rate.trajectory in the script PLOT_rate.trajectory.R.

load("rate.trajectories.RData", verbose=T)
## Loading objects:
##   rate.list
load("BayesTraits_ancestral_modules.RData", verbose=T)
## Loading objects:
##   anc.modules
names(rate.list); names(anc.modules)
## [1] "head" "body" "tail" "limb"
## [1] "head" "body" "tail" "limb"

The rate.list list is made up of processed data frames which hold information about the tree, BayesTraits results (rate/scalar estimates, shift frequency in posterior, etc.), color scheme.
The anc.modules list is made up of data frames holding the observed and estimated trait values for all tips and internal nodes of the tree.

Using the rate.trajectory.BT.module function we can plot a traitgram/phenogram style with branches colored by evolutionary rate for any module and any clade (set of tips) or single tip of the tree. I’ll show you how.

Here’s a traitgram for the limb module highlighting the trajectory of the Tiliqua/Cyclodomorphus clade. You will notice the dark color on the stem branch of this clade suggesting a shift in evolutionary rate.

rate.trajectory.BT.module(tree = egernia.tree, module = anc.modules$limb,
                                 PPP.all.res = rate.list$limb,
                                 tip.spread = c("Tiliqua_rugosa","Cyclodomorphus_maximus"),
                                 focus = "clade", psize=3, lsize=2, background.color="grey",
                                 gimme.the.data=F, inset=T)

We can limit this to just a single tip, say Tiliqua rugosa like this:

rate.trajectory.BT.module(tree = egernia.tree, module = anc.modules$limb,
                                 PPP.all.res = rate.list$limb,
                                 tip.spread = c("Tiliqua_rugosa"),
                                 focus = "tip", psize=3, lsize=2, background.color="grey",
                                 gimme.the.data=F, inset=T)

We can also pick an entirely different clade, one that shows no sign of rate shifts. You’ll notice how consistent the rates are.

rate.trajectory.BT.module(tree = egernia.tree, module = anc.modules$limb,
                                 PPP.all.res = rate.list$limb,
                                 tip.spread = c("Liopholis_whitii", "Liopholis_personata"),
                                 focus = "clade", psize=3, lsize=2, background.color="grey",
                                 gimme.the.data=F, inset=T)

Rate-to-Node Trajectories

It’s great that we can show the rates along individual branches, but we might want to look at the rates of branches with inferred rate shifts against the background rate.

Since we already have the processed BayesTraits output files (in rate.list), we can just plug them into the rate.to.node.BT function.

Head Module

rate.to.node.BT(phy=egernia.tree, PPP.obj=rate.list$head, psize=3, lsize=2, log.rates=T, col.palette="YlOrRd")

Body Module

rate.to.node.BT(phy=egernia.tree, PPP.obj=rate.list$body, psize=3, lsize=2, log.rates=T, col.palette="YlOrRd")

Tail Module

rate.to.node.BT(phy=egernia.tree, PPP.obj=rate.list$tail, psize=3, lsize=2, log.rates=T, col.palette="YlOrRd")

Sim-to-Node and Morpho-Trajectories

Last we’ll do the morphotrajectory and sim-to-node plots.

source("../Scripts/morphotrajectory.R")

This will allow us to visualize the expansion of morphospace by comparing our empirical observations (and estimated ancestral values) against a set of Brownian Motion simulations.

We can start by looking at a remarkable example of Egernia stokesii tail proportions. Here we can see that the combination of tail traits for this species fall outside what we would expect from BM.

sim.to.node(phy=egernia.tree, VRphy=all.BT$scalar.trees$Tail, 
            trait=all.modules$tail[,c("Tail_Length","Tail_Width")],
            tip="Egernia_zellingi", sim.num=500)

In comparison we can plot what the evolution of tail traits looks like for something quite standard, like Liopholis whitii. Here we see that the tail traits for this lizard are quite conservative, compared to the eovlutionary possibility under Brownian Motion.

sim.to.node(phy=egernia.tree, VRphy=all.BT$scalar.trees$Tail, 
            trait=all.modules$tail[,c("Tail_Length","Tail_Width")],
            tip="Liopholis_whitii", sim.num=500)

The above plot shows us how trait variance accumulates under simulated BM and in our empirical data, and we can see that the distance in trait-space travelled between parent and child (ancestor and descendant) nodes is sometimes great. Let’s actually compare if some of those distances travelled are greater than we’d expect under BM.

body.dbn <- distance.btwn.nodes(phy = egernia.tree,
                                VRphy = all.BT$scalar.trees$Body,
                                trait = all.modules$body,
                                tip = "Cyclodomorphus_michaeli",
                                sim.num = 100,
                                stat = "quantile")

head.dbn <- distance.btwn.nodes(phy = egernia.tree,
                                VRphy = all.BT$scalar.trees$Head,
                                trait = all.modules$head,
                                tip = "Tiliqua_rugosa",
                                sim.num = 100,
                                stat = "quantile")

tail.dbn <- distance.btwn.nodes(phy = egernia.tree,
                                VRphy = all.BT$scalar.trees$Tail,
                                trait = all.modules$tail,
                                tip = "Egernia_zellingi",
                                sim.num = 100,
                                stat = "quantile")

limb.dbn <- distance.btwn.nodes(phy = egernia.tree,
                                VRphy = all.BT$scalar.trees$Limb,
                                trait = all.modules$limb,
                                tip = "Tiliqua_rugosa",
                                sim.num = 100,
                                stat = "quantile")

(limb.dbn + tail.dbn) /
  (body.dbn + head.dbn)

Here we can see that nodes that are significantly above 0 (including the gray shaded quantiles) show greater trait evolution than expected under BM diffusion. This suggests huge trait changes are sometimes confined to individual branches.


Elaboration and Innovation

When looking across many morphological traits, change along some axes seems to be more likely than others. This suggests morphological “lines of least resistance” or a major elaborative axis along which the majority of change occurs. In contrast, minor axes of change contribute to novel morphologies via innovative change (away from existing phenotypes). We can quantify and visualize these elaborative and innovative changes.

source("../Scripts/PLOT_innovate_elaborate.R")
load("BayesTraits_ancestral_traits.Rdata", verbose=T)
## Loading objects:
##   all.traits
##   anc.traits
##   anc.all.traits
##   all.trees
anc.all.df <- data.frame(anc.all.traits)

Elaboration and Innovation on Branches

The innovate.elaborate.phy function will do the following things: 1. perform PCA on all traits provided in the arg. trait.df 2. run a linear model on the two requested axes arg. PCs 3. recenter the regression through the estimated root values and correct the residuals 4. for each parent-child pair of nodes determine the euclidean distance and the angle of change 5. color branches as either elaborative (blues) or innovative (oranges) and scale the hue by the amount of change (lighter–less change; darker–more change)

# start with the full dataset
total.ie.phy <- innovate.elaborate.phy(trait.df = anc.all.df, 
                                       phy=egernia.tree, plot=T, summary=T, 
                                       angles="equal", PCs=c(1,2))
## Loading required package: ggfortify
## Registered S3 methods overwritten by 'useful':
##   method         from     
##   autoplot.acf   ggfortify
##   fortify.acf    ggfortify
##   fortify.kmeans ggfortify
##   fortify.ts     ggfortify
## Warning: `data_frame()` was deprecated in tibble 1.1.0.
## ℹ Please use `tibble()` instead.
## ℹ The deprecated feature was likely used in the useful package.
##   Please report the issue at <https://github.com/jaredlander/useful/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5     PC6     PC7
## Standard deviation     2.1060 0.9709 0.33748 0.17221 0.09342 0.08209 0.07279
## Proportion of Variance 0.7979 0.1696 0.02049 0.00534 0.00157 0.00121 0.00095
## Cumulative Proportion  0.7979 0.9675 0.98795 0.99328 0.99485 0.99607 0.99702
##                            PC8     PC9    PC10    PC11    PC12    PC13    PC14
## Standard deviation     0.06822 0.06383 0.04217 0.03716 0.03583 0.02978 0.02570
## Proportion of Variance 0.00084 0.00073 0.00032 0.00025 0.00023 0.00016 0.00012
## Cumulative Proportion  0.99786 0.99859 0.99891 0.99916 0.99939 0.99955 0.99967
##                           PC15    PC16    PC17    PC18    PC19
## Standard deviation     0.02496 0.02240 0.01971 0.01406 0.01173
## Proportion of Variance 0.00011 0.00009 0.00007 0.00004 0.00002
## Cumulative Proportion  0.99978 0.99987 0.99994 0.99998 1.00000

In case you were curious about how

total.rotation <- plot.rotations(total.ie.phy$pca, plot=T)
## Loading required package: reshape2

You might be interested in applying the framework to individual modules. Here we’ll look at what this looks like in the limb module.

limb.ie.phy <- innovate.elaborate.phy(trait.df = select(anc.all.df, Upper_Leg,Lower_Leg,Foot,
                                                        Upper_Arm,Lower_Arm,Hand),
                                      phy=egernia.tree, plot=T, summary=T, angles="equal", PCs=c(1,2))

## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5     PC6
## Standard deviation     0.2048 0.1024 0.04778 0.03306 0.02724 0.01997
## Proportion of Variance 0.7365 0.1842 0.04010 0.01920 0.01304 0.00700
## Cumulative Proportion  0.7365 0.9207 0.96077 0.97996 0.99300 1.00000
limb.rotation <- plot.rotations(limb.ie.phy$pca, plot=T)

Elaboration and Innovation Across Clades

Let’s visualize the relationship between primary elaborative and innovative axes.

all.traits$Genus <- sapply(rownames(all.traits), function(x) strsplit(x, "_")[[1]][1])
pca.all <- all.traits %>%
  select(-Genus) %>%
  prcomp()
pca.all.x <- data.frame(pca.all$x)
pca.all.x$Genus <- all.traits$Genus
pca.all.x$Genus_New <- unlist(sapply(pca.all.x$Genus, function(x) 
  if(x=="Bellatorias") paste("Bel_Ege")
  else if(x=="Egernia") paste("Bel_Ege")
  else if(x=="Corucia") paste("Cor")
  else if(x=="Tribolonotus") paste("Tri")
  else if(x=="Cyclodomorphus") paste("Cyc_Til")
  else if(x=="Tiliqua") paste("Cyc_Til")
  else if(x=="Liopholis") paste("Lio")
  else if(x=="Lissolepis") paste("Lis")))

ggplot(data=pca.all.x, aes(x=PC1,y=PC2)) +
  geom_point(aes(color=Genus_New)) +
  geom_smooth(method="lm", se=F, aes(color=Genus_New)) +
  #stat_ellipse(aes(color=Genus_New), lty="dotted") +
  geom_smooth(data=transform(pca.all.x, Genus_New=NULL),
              method="lm", se=F, color="black", lty="dotted") +
  scale_color_brewer(palette="Paired") +
  #scale_color_scico_d(palette="batlow") +
  theme_classic()
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'

You’ll notice that the relationship between the first two PC axes is actually really quite different among the major clades. The overall trend (regression) is indicated by the dotted line (slope = 0). Cyclodomorphus/Tiliqua have a positive slope suggesting increasing tail length with increasing interlimb length. Interestingly we see the opposite in Bellatorias/Egernia and Liopholis.

Elaboration and Innovation Through Time

Above we’ve seen how the relationship between primary axes can change across groups. However, we might be interested to see how this relationship changes through time. We can plot the regression among any taxa living at a given point in time (generally estimated ancestors at nodes).

The below function inn.elab.tt (innovate-elaborate-through-time) will do that for us.

# visualize change in major axis (elaboration) through time for the whole tree
pca.all <- prcomp(anc.all.df)$x
inn.elab.tt(phy=egernia.tree, trait.df=anc.all.df, pca=pca.all, base.color="Reds")

INNOVATE ELABORATE TO WORK THROUGH